In this notebook, a template is provided for you to implement your functionality in stages which is required to successfully complete this project. If additional code is required that cannot be included in the notebook, be sure that the Python code is successfully imported and included in your submission, if necessary. Sections that begin with 'Implementation' in the header indicate where you should begin your implementation for your project. Note that some sections of implementation are optional, and will be marked with 'Optional' in the header.
In addition to implementing code, there will be questions that you must answer which relate to the project and your implementation. Each section where you will answer a question is preceded by a 'Question' header. Carefully read each question and provide thorough answers in the following text boxes that begin with 'Answer:'. Your project submission will be evaluated based on your answers to each of the questions and the implementation you provide.
Note: Code and Markdown cells can be executed using the Shift + Enter keyboard shortcut. In addition, Markdown cells can be edited by typically double-clicking the cell to enter edit mode.
# Load pickled data
import pickle
# TODO: Fill this in based on where you saved the training and testing data
training_file = 'train.p'
testing_file = 'test.p'
with open(training_file, mode='rb') as f:
train = pickle.load(f)
with open(testing_file, mode='rb') as f:
test = pickle.load(f)
X_train, y_train = train['features'], train['labels']
X_test, y_test = test['features'], test['labels']
The pickled data is a dictionary with 4 key/value pairs:
'features' is a 4D array containing raw pixel data of the traffic sign images, (num examples, width, height, channels).'labels' is a 2D array containing the label/class id of the traffic sign. The file signnames.csv contains id -> name mappings for each id.'sizes' is a list containing tuples, (width, height) representing the the original width and height the image.'coords' is a list containing tuples, (x1, y1, x2, y2) representing coordinates of a bounding box around the sign in the image. THESE COORDINATES ASSUME THE ORIGINAL IMAGE. THE PICKLED DATA CONTAINS RESIZED VERSIONS (32 by 32) OF THESE IMAGESComplete the basic data summary below.
### Replace each question mark with the appropriate value.
# TODO: Number of training examples
n_train = len(train['features'])
# TODO: Number of testing examples.
n_test = len(test['features'])
# TODO: What's the shape of an traffic sign image?
image_shape = train['features'].shape[1:]
# TODO: How many unique classes/labels there are in the dataset.
n_classes = len(set(train['labels']))
print("Number of training examples =", n_train)
print("Number of testing examples =", n_test)
print("Image data shape =", image_shape)
print("Number of classes =", n_classes)
Visualize the German Traffic Signs Dataset using the pickled file(s). This is open ended, suggestions include: plotting traffic sign images, plotting the count of each sign, etc.
The Matplotlib examples and gallery pages are a great resource for doing visualizations in Python.
NOTE: It's recommended you start with something simple first. If you wish to do more, come back to it after you've completed the rest of the sections.
### Data exploration visualization goes here.
### Feel free to use as many code cells as needed.
import matplotlib.pyplot as plt
# Visualizations will be shown in the notebook.
%matplotlib inline
from random import random
import numpy as np
import tensorflow as tf
import os
for (label, freq) in zip(range(n_classes), np.bincount(train['labels'])):
print (label, freq)
plt.bar(range(n_classes), np.bincount(train['labels']))
# create a dictionary of labels
from collections import defaultdict
label_dict = {}
label_dict = defaultdict(lambda: [], label_dict)
N = len(train['features'])
for i in range(N):
label_dict[train['labels'][i]].append(i)
# select a few plots randomly from each class and plot
n_cols = 8
plt.figure(figsize=[20, 100])
for i in range(n_classes):
for j in range(n_cols):
plt.subplot(n_classes, n_cols, i*n_cols+j+1)
n = int(random() * len(label_dict[i]))
image = train['features'][label_dict[i][n]]
plt.imshow(image)
plt.axis('off')